Traditional tests fall into two categories, both of
which have several advantages and disadvantages that need to be
considered when determining the type of test to use. 
Constructed-response tests, such as essay tests, ask students to 
construct their own responses. Thus, students are required not only 
to recall but to organize and often apply knowledge. On the other 
hand, selected-response tests, such as multiple choice tests, ask 
students to select an answer between or among alternatives. While 
questions for constructed-response tests are relatively easy to 
prepare, they are much more difficult to grade and often contain 
relatively few questions. One of the advantages to 
constructed-response tests is that responses are less affected by 
guessing, and clues about students' thought processes can be 
provided. Selected-response tests require much more time to create,
but scoring is much easier. One major advantage of these tests is for 
measuring knowledge of specific facts. Essay and written retellings 
are the most common of the constructed-response item types. Other 
types of constructed-response tests are the cloze, completion, and
short answer items. Special caution should be taken when using cloze 
tests to measure reading ability, since the reading act itself seems 
to be disrupted by cloze testing. Selected-response items include 
true/false or alternate response, matching, and multiple choice. 
While there are several basic problems and limitations surrounding 
all types of assessments, many problems can be attributed not just to 
the test itself, but to misuse of the test. (Twenty references are 
attached.) (RS)

ASSESSMENT: ALL TESTS ARE NOT CREATED EQUALLY

Testing continues to be an important, yet controversial topic to most 
educators. The public and private sectors are demanding more accountability 
through standardized testing from the public schools. At the same time, experts 
are still arguing about issues such as test bias, ambiguity, and even the very
validity of tests. Standardized test usage continues to grow despite these
debates. Teacher-made tests also remain an integral piece of the assessment
of students' abilities in most classrooms.

While new and hopefully better assessments are being developed, the 
"traditional" forms of tests are still being used by a majority of classroom 
teachers. This article reviews some fundamental elements of traditional tests in 
an effort to clarify some of the issues surrounding them, so that teachers may 
select and create tests that are appropriate to their goals and the knowledge that 
they want to measure. First, an overview will be presented of the two major 
categories of tests. Next, specific types of test items will be examined in more 
detail. 

Constructed-response vs. Selected-response Tests 

Traditional tests fall into two major categories. Both have several
advantages and disadvantages that need to be considered when determining
which type of test to use. Constructed-response tests, such as essay tests, ask
individuals to construct their own responses. Thus, students are required not
only to recall, but to organize and often apply knowledge. On the other hand,
selected-response tests, such as multiple choice tests, ask individuals to select
an answer between or among alternatives.

There are many things to consider when choosing between
constructed-response tests and selected-response tests. While questions for a
constructed-response test are relatively easy to prepare, they are much more difficult to
grade. A considerable amount of time must be spent in creating clear criteria, 
such as scoring rubrics, for assessing the answers. Likewise, scoring the tests 
takes considerable time. The scoring of constructed-response test items 
involves at least some subjectivity, even when criteria have been carefully 
established. Another disadvantage is that these tests contain relatively few 
questions, which in some cases prevents adequate sampling of the subject 
matter. 

A cumulative listing from historic and contemporary test and 
measurement specialists (Ahmann & Glock, 1975; Cook, 1950; Cunningham, 
1986; Ebel & Frisbie, 1986; Gronlund, 1982; Mehrens & Lehmann, 1984; Payne, 
1974; Popham, 1978; Roid & Haladyna, 1982; Thorndike & Hagen, 1969; 
Wesman, 1971) suggests advantages and disadvantages to the
constructed-response test items. The first advantage is that students do
construct their own answers. Responses are less affected by guessing, and
clues about students' thought processes can be provided. There is another
important factor to consider which can be an advantage, or a disadvantage,
depending on the purpose of giving the test. The scores given on
constructed-response tests are directly related to how well the student can write,
adding one more factor into what is actually being measured.

Despite the complexities of scoring, the use of the constructed-response
test is rising. Many feel that the advantages far outweigh the disadvantages.
With the focus on process over product, and the push for more and more writing
in the classroom, test developers are certain to continue the pursuit of refining
and redesigning constructed-response tests.

There are also trade-offs when a selected-response test is used. These
tests require much more time to create, but scoring them is relatively quick.
Many people favor selected-response tests on the assumption that they are
totally objective, but this may be erroneous. The scores on a
selected-response test can also be considered subjective, since
"right" and "wrong" answers are pre-determined by the test developer (Ebel,
1979, pp. 100-101). This is a weakness of selected-response tests that is often ignored.

Certainly one major advantage of selected-response tests is in
measuring knowledge of specific facts. Selected-response tests allow a broad
sampling of subject matter in a highly structured testing situation. The questions
can be constructed to measure knowledge in any area. The scoring is simple,
primarily objective, and reliable (Cunningham, 1986; Mehrens & Lehmann,
1984; Nunnally, 1967; Payne, 1974; Roid & Haladyna, 1982). However, this
very advantage can also be considered a disadvantage. Many believe that
these tests do not require much "real" thinking since there can only be one
correct answer to questions. These critics believe such tests encourage little
more than rote memorization (Bracey, 1930; Haney & Madaus, 1989; Neill &
Medina, 1989; Valencia & Pearson, 1987). However, when the objective of the
assessment is to measure knowledge of facts, these tests can provide a
relatively accurate assessment of such knowledge.

Thorndike and Hagen (1969, pp. 67-72) state there are theoretical issues
to consider when choosing what to include in a test. One consideration deals
with the adequacy of the test in eliciting student response. Choosing whether to
develop a constructed-response test or a selected-response test should
coincide with the purpose of the test. Popham (1978, pp. 44-45) states that for
measuring knowledge of factual information, the selected-response test is more
efficient. The selected-response test is also useful when a high degree of
specificity is needed, such as tests designed to see if reteaching of facts is
necessary. However, for measuring originality, the ability to synthesize ideas,
write effectively, or to solve problems, constructed-response tests are obviously
better.

Test Item Choice
Constructed-response Test Items

The types of items associated with constructed-response tests include
essays, written retellings, cloze, completion, and short answer items.
Essay & Written Retellings

The most common of the constructed-response item types are the essay 
and written retellings. As can already be inferred, answering a well-developed
essay question can require application of knowledge, and other forms of
higher-level thinking, rather than simple recall. Therefore, essay tests, when written
and scored with care, can provide some evidence of the student's ability to 
apply knowledge. However, a written retelling, though it requires construction of
an answer like the essay, requires simple recall for the most part. Consequently,
the differences between responses to critical essay questions and written
retellings are enormous. It must be remembered that success on essay and
written retelling tests in particular is tied to the student's ability to write. Again,
this can be considered an advantage or a disadvantage, but it must always be
remembered when interpreting the results of the tests.
Cloze (Fill-in), Completion and Short Answer Tests.

Other types of constructed-response tests are the cloze, completion, and
short answer items. While these tests do not rely as heavily on the student's
ability to write as do the written retelling and essay, it still must be considered
somewhat of a factor. The amount of information that is required to answer
these types of questions can vary significantly. They can require little more than 
simple recall if not written with care. 

A special word of caution is needed for using cloze tests to measure
reading ability. Powell (1988) and Ashby-Davis (1985) agree that cloze tests
require quite different thinking processes than other traditional forms of
assessments. While taking a cloze test, students read more slowly and reread more
often. Powell (1988) had students "think-aloud" as they completed reading
tests. In the verbal protocols, the students did not tie in their background
knowledge to the passage during a cloze test as much as they did when taking
multiple choice tests or giving retellings. The students' attempts to understand
the text appeared to be limited to the sentence level rather than the passage
level. This research suggests that cloze tests may not be a valid measure of
overall reading performance, since the reading act itself seems to be disrupted
by cloze testing. However, cloze tests may be useful in determining a student's
ability to use context clues.
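
For readers unfamiliar with the format, a cloze passage is produced by replacing words in a text with blanks that the student must fill in. The following is only a sketch of one common convention, deleting every nth word; the deletion interval, the function name `make_cloze`, and the sample sentence are assumptions for illustration and are not drawn from the studies cited above.

```python
def make_cloze(text, n=5, blank="_____"):
    """Replace every nth word of text with a blank.

    Returns the cloze passage and the list of deleted words (the answer key).
    Words are split on whitespace, so attached punctuation is deleted with the word.
    """
    passage_words = []
    answers = []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            answers.append(word)        # keep the deleted word for scoring
            passage_words.append(blank)
        else:
            passage_words.append(word)
    return " ".join(passage_words), answers


# Hypothetical sample sentence, used only to show the output format.
passage, answers = make_cloze("The quick brown fox jumps over the lazy dog today", n=5)
print(passage)
print(answers)
```

Because each blank can often be filled from the words immediately around it, a sketch like this also makes it easier to see why students' attention may stay at the sentence level during a cloze test.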

Selected-response Test Items 
The types of items associated with selected-response tests include
true/false or alternate-response, matching and multiple choice.
True/False Items.

True/false items require the examinees to determine the truth or falsity of
a statement. Advantages and disadvantages of true/false items have been cited
by authorities in the field of test and measurement (Ahmann & Glock, 1975;
Cook, 1950; Cunningham, 1986; Ebel & Frisbie, 1986; Mehrens & Lehmann,
1984; Payne, 1974; Roid & Haladyna, 1982; Swezey, 1981; Thorndike & Hagen,
1969; Wesman, 1971). Advantages of the true/false item include speed in
scoring, ease of construction, inclusion of a larger number of items and
measurement of factual knowledge. There are several disadvantages to
true/false items. It is very difficult to write good true/false test items. For
example, items about controversial material are difficult to write. There are also
many instances where an answer is not unequivocally true or false; there are
degrees of correctness. Finally, the fifty percent chance of getting a
question correct by guessing must be acknowledged when interpreting the
scores.
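
The effect of guessing on true/false scores can be made concrete with a short calculation. The sketch below is illustrative only: it assumes every item is answered by blind guessing with a one-in-two chance of success and that items are independent of one another. The function name `chance_of_score_by_guessing` is hypothetical and not part of any testing package.

```python
from math import comb


def chance_of_score_by_guessing(n_items, min_correct, p_guess=0.5):
    """Probability of getting at least min_correct of n_items right by guessing alone.

    Assumes independent items, each guessed correctly with probability p_guess
    (a binomial model).
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Chance of 7 or more correct on a ten-item true/false test by pure guessing.
p = chance_of_score_by_guessing(10, 7)
print(f"{p:.3f}")  # prints 0.172
```

Under these assumptions, a student who guesses on every item of a ten-item true/false test still has roughly a 17 percent chance of scoring 70 percent or better, which is one reason scores on short true/false tests call for cautious interpretation.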

Matching Items.

Matching items require students to match items placed in two or more
columns. Historical and current literature (Ahmann & Glock, 1975; Cook, 1950;
Cunningham, 1986; Ebel & Frisbie, 1986; Mehrens & Lehmann, 1984; Payne,
1974; Popham, 1978; Roid & Haladyna, 1982; Swezey, 1981; Thorndike &
Hagen, 1969; Wesman, 1971) cites the advantages and disadvantages of
matching items. A matching format offers several advantages. Items are easy to
construct and are more efficient than multiple-choice. Items are economical of
space and time and are written in a compact form. Questions written as
matching items are reasonably free from guessing. Disadvantages of matching
items are that they are suitable for measuring association only, and they are
susceptible to clues. Good matching items are also difficult to write.
Multiple-Choice Items.

Multiple-choice items require pupils to select a response from a
specified number of options. Each multiple-choice item consists of two parts:
the stem and suggested responses. Test and measurement authorities
(Ahmann & Glock, 1975; Cook, 1950; Cunningham, 1986; Ebel & Frisbie,
1986; Mehrens & Lehmann, 1984; Payne, 1974; Popham, 1978; Roid &
Haladyna, 1982; Swezey, 1981; Thorndike & Hagen, 1969; Wesman, 1971)
state that there are advantages and disadvantages of multiple-choice items.
Multiple-choice items can be adapted to a wide variety of material and can
measure understanding, discrimination and judgment. They can be scored
quickly and can provide diagnostic information if the response patterns are
analyzed. One limitation of the multiple-choice item is that an extra amount of
time and skill is required to construct good items. It is difficult to provide three or
four plausible incorrect responses, and there is a tendency to write only recall
questions.
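
As a rough illustration of what analyzing response patterns might look like in practice, the sketch below tallies how often each option was chosen on a single item. The response data, the keyed answer, and the function name `option_counts` are all hypothetical; this is one simple way a teacher could look at the pattern, not a procedure taken from the sources cited above.

```python
from collections import Counter


def option_counts(responses, options="ABCD"):
    """Count how many students chose each option for one multiple-choice item."""
    tally = Counter(responses)
    return {option: tally.get(option, 0) for option in options}


# Hypothetical responses from ten students to one item keyed "B".
responses = ["B", "C", "B", "C", "C", "A", "B", "C", "B", "C"]
keyed_answer = "B"

counts = option_counts(responses)
proportion_correct = counts[keyed_answer] / len(responses)

# The incorrect option chosen most often may point to a shared misconception.
wrong_options = [option for option in counts if option != keyed_answer]
most_chosen_wrong = max(wrong_options, key=counts.get)

print(counts)
print(proportion_correct, most_chosen_wrong)
```

In this made-up example more students chose option C than the keyed answer, which would suggest either a widely shared misunderstanding or a flawed item, and an option that nobody chose (D here) is not functioning as a plausible incorrect response.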

Conclusions 

While there are several basic problems and limitations surrounding all 
types of assessments, many of the problems surrounding them can be attributed 
not just to the test itself, but to the misuse of the test. For example, information
about process, or how students came to certain conclusions, can only be
inferred from all types of tests. In order to really understand where a student's
thinking went wrong, one must literally ask the student to explain how they came
up with an answer. Informal assessments such as this are extremely important
to the overall assessment of all students.

We need to be more aware of what different types of tests measure, and 
the valid conclusions we can make from the test scores. Too often tests are
used to measure something that cannot be measured by that test, and
decisions about curriculum and placement are then based on invalid information.
Tests in and of themselves cannot give educators all the answers. Literally all
tests can only be considered as one sample of a student's ability, and must be
considered along with other factors for a valid assessment of student progress.
It would be difficult to find any educator who wouldn't agree that we must find
better assessment methods. Testing has not kept up with advances in
educational theory. Portfolio assessment and authentic assessment are two of
the ways that leaders in the field are making strides in improving assessment.
However, as we are developing new ways to assess students, we must be
mindful of how we use the ones we already have.